
Dual Comparison

The Dual Comparison tab lets users evaluate two validations side by side, making it easier to spot changes in model performance across datasets, annotation sets, or model versions. The interface presents metrics, confusion matrices, and performance distributions to support detailed analysis.


1. Validation Selection

Users can select two validations:

  • Base Validation: The primary reference point
  • Compared Validation: The second run for comparison

Each selection displays metadata, including:

  • Model Name & Type
  • Validation Name
  • Dataset & Modality
  • Annotation Set
  • Sample Count (Total, Done, Failed)
  • Tags and Source Info
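
For orientation, the metadata shown for a selected validation can be pictured as a simple record. The sketch below is illustrative only; the field names and values are assumptions, not the platform's actual schema.

```python
from dataclasses import dataclass, field

# Illustrative sketch only: field names are assumptions, not the product's schema.
@dataclass
class ValidationSummary:
    model_name: str
    model_type: str
    validation_name: str
    dataset: str
    modality: str
    annotation_set: str
    samples_total: int
    samples_done: int
    samples_failed: int
    tags: list[str] = field(default_factory=list)
    source: str = ""

# Hypothetical "Base Validation" selection.
base = ValidationSummary(
    model_name="classifier-a", model_type="classification",
    validation_name="baseline-run", dataset="chest-xray-v1", modality="X-ray",
    annotation_set="radiologist-consensus", samples_total=1000,
    samples_done=990, samples_failed=10, tags=["baseline"], source="upload",
)
```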

2. Key Performance Metrics

Side-by-side metrics help users assess both models at a glance; a computation sketch follows the list:

  • Accuracy
  • True Positives (TP)
  • True Negatives (TN)
  • False Positives (FP)
  • False Negatives (FN)
  • Sensitivity
  • Specificity
  • Precision

3. Class Performance

This section compares the performance of each class between validations:

  • Per-class Accuracy
  • Sensitivity and Specificity
  • Precision and F1 Score

Charts highlight any performance degradation or improvement between versions or datasets.
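The per-class values can be understood as one-vs-rest metrics computed for each label. A minimal sketch, assuming integer class labels (illustrative only, not the platform's implementation):

```python
import numpy as np

def per_class_metrics(y_true: np.ndarray, y_pred: np.ndarray) -> dict:
    """One-vs-rest accuracy, sensitivity, specificity, precision and F1 per class."""
    results = {}
    for cls in np.unique(y_true):
        pos_true = y_true == cls
        pos_pred = y_pred == cls
        tp = np.sum(pos_pred & pos_true)
        tn = np.sum(~pos_pred & ~pos_true)
        fp = np.sum(pos_pred & ~pos_true)
        fn = np.sum(~pos_pred & pos_true)
        precision = tp / max(tp + fp, 1)
        sensitivity = tp / max(tp + fn, 1)
        results[int(cls)] = {
            "accuracy": (tp + tn) / len(y_true),
            "sensitivity": sensitivity,
            "specificity": tn / max(tn + fp, 1),
            "precision": precision,
            "f1": 2 * precision * sensitivity / max(precision + sensitivity, 1e-9),
        }
    return results
```

Computing this for both validations and plotting the values per class is what surfaces the degradations or improvements mentioned above.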


4. Confidence Threshold Evaluation

Users can compare how both models behave at different confidence thresholds:

  • Shows how performance metrics shift as the confidence threshold changes
  • Useful for threshold tuning and model calibration
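
One way to picture the threshold sweep, as a sketch assuming each validation provides a per-sample confidence score (the variable names and data below are hypothetical):

```python
import numpy as np

def metrics_at_threshold(y_true, scores, threshold):
    """Binarize confidence scores at a threshold and report sensitivity/specificity."""
    y_pred = (scores >= threshold).astype(int)
    tp = np.sum((y_pred == 1) & (y_true == 1))
    fn = np.sum((y_pred == 0) & (y_true == 1))
    tn = np.sum((y_pred == 0) & (y_true == 0))
    fp = np.sum((y_pred == 1) & (y_true == 0))
    return {"sensitivity": tp / max(tp + fn, 1), "specificity": tn / max(tn + fp, 1)}

# Sweep a few thresholds for both validations to see where the trade-off differs.
y_true = np.array([1, 0, 1, 1, 0, 0, 1, 0])
base_scores = np.array([0.9, 0.2, 0.7, 0.4, 0.3, 0.6, 0.8, 0.1])
compared_scores = np.array([0.8, 0.1, 0.9, 0.6, 0.2, 0.4, 0.3, 0.2])
for t in (0.3, 0.5, 0.7):
    print(t, metrics_at_threshold(y_true, base_scores, t),
          metrics_at_threshold(y_true, compared_scores, t))
```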

5. Distribution Visualizations

Several comparative charts are available:

  • Prediction Distribution: Number of predicted cases per class
  • Confidence Distribution: How confident each model was for each class
  • Dataset Class Distribution: Breakdown of label frequencies in each dataset
  • ROC & Precision-Recall Curves: Model discrimination ability across thresholds
  • Lift Chart (if available): Measures how efficiently the model separates classes
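
As an example of how the data behind the ROC and Precision-Recall charts can be derived, here is a sketch using scikit-learn, assuming per-sample confidence scores for the positive class are available (the scores and labels below are hypothetical):

```python
import numpy as np
from sklearn.metrics import roc_curve, precision_recall_curve, auc

y_true = np.array([1, 0, 1, 1, 0, 0, 1, 0])
base_scores = np.array([0.9, 0.2, 0.7, 0.4, 0.3, 0.6, 0.8, 0.1])
compared_scores = np.array([0.8, 0.1, 0.9, 0.6, 0.2, 0.4, 0.3, 0.2])

# Compute both curves per validation; the areas summarize discrimination ability.
for name, scores in (("base", base_scores), ("compared", compared_scores)):
    fpr, tpr, _ = roc_curve(y_true, scores)
    prec, rec, _ = precision_recall_curve(y_true, scores)
    print(name, "ROC AUC:", auc(fpr, tpr), "PR AUC:", auc(rec, prec))
```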

6. Confusion Matrix

Confusion matrices for both validations are shown side by side:

  • Helps identify class confusion
  • Highlights misclassification patterns
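
A minimal sketch of producing the two matrices for side-by-side inspection, using scikit-learn's confusion_matrix (the labels and predictions below are hypothetical):

```python
import numpy as np
from sklearn.metrics import confusion_matrix

y_true = np.array([0, 1, 2, 2, 1, 0, 2, 1])
base_pred = np.array([0, 1, 2, 1, 1, 0, 0, 1])
compared_pred = np.array([0, 1, 2, 2, 0, 0, 2, 1])

labels = [0, 1, 2]
cm_base = confusion_matrix(y_true, base_pred, labels=labels)
cm_compared = confusion_matrix(y_true, compared_pred, labels=labels)

# Rows are true classes, columns are predicted classes; off-diagonal cells
# show where each validation confuses one class for another.
print("Base:\n", cm_base)
print("Compared:\n", cm_compared)
```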

7. Mismatched Predictions

This table highlights samples where the two models disagreed:

  • Useful for error analysis
  • Reveals model-specific blind spots
  • Enables targeted review of problematic predictions
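
A sketch of how such a disagreement table could be assembled from two sets of predictions; the sample IDs, labels, and column names are hypothetical:

```python
import pandas as pd

# Hypothetical samples and predictions; not the product's data model.
df = pd.DataFrame({
    "sample_id": ["s1", "s2", "s3", "s4", "s5"],
    "ground_truth": ["cat", "dog", "dog", "cat", "cat"],
    "base_pred": ["cat", "dog", "cat", "cat", "dog"],
    "compared_pred": ["cat", "cat", "dog", "cat", "cat"],
})

# Keep only the samples where the two validations disagree, flagging which
# (if either) prediction matches the ground truth for targeted review.
mismatches = df[df["base_pred"] != df["compared_pred"]].assign(
    base_correct=lambda d: d["base_pred"] == d["ground_truth"],
    compared_correct=lambda d: d["compared_pred"] == d["ground_truth"],
)
print(mismatches)
```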

✅ Tip: Use Dual Comparison to validate changes between model versions, track dataset impact, and gain confidence in model updates before deployment.